Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Abstract

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.
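The abstract's core idea, trading cheap simulator queries against costly hardware trials inside one Bayesian optimizer, can be illustrated with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' algorithm: it treats the information source as an extra Gaussian-process input dimension, assumes a known fixed cost per source, and uses plain expected improvement as a stand-in for the paper's Entropy Search criterion. All names, costs, and kernel settings are illustrative assumptions.

    # Minimal multi-fidelity Bayesian optimization sketch (illustrative only).
    # A GP is fit over (policy parameter, source) pairs; the next query is the
    # candidate with the best acquisition value per unit experiment cost.
    # Expected improvement stands in for the paper's Entropy Search criterion.
    import numpy as np
    from scipy.stats import norm
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    COST = {0.0: 1.0, 1.0: 20.0}  # source 0 = simulation, 1 = hardware (made-up costs)

    def expected_improvement(mu, sigma, best):
        # EI for minimization of the control cost.
        z = (best - mu) / np.maximum(sigma, 1e-9)
        return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

    def next_query(X, y, candidates):
        """X: past queries as rows of (theta, source); y: observed costs;
        candidates: array of (theta, source) points to choose from."""
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=[0.3, 1.0]),
                                      normalize_y=True)
        gp.fit(X, y)
        mu, sigma = gp.predict(candidates, return_std=True)
        score = expected_improvement(mu, sigma, y.min())
        score /= np.array([COST[s] for s in candidates[:, -1]])
        return candidates[np.argmax(score)]

    # Example: one simulated and one hardware observation of a 1-D policy gain.
    X = np.array([[0.2, 0.0], [0.8, 1.0]])
    y = np.array([1.5, 0.9])
    grid = np.array([[t, s] for t in np.linspace(0, 1, 21) for s in (0.0, 1.0)])
    print(next_query(X, y, grid))

Dividing the acquisition value by the per-source cost is what steers the optimizer toward cheap simulator queries first, reserving hardware trials for when only they remain informative.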
